
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration

Chuang, Yun-Yen, Hsu, Hung-Min, Lin, Kevin, Gu, Chen-Sheng, Li, Ling Zhen, Chang, Ray-I, Lee, Hung-yi

arXiv.org Artificial Intelligence

The diffusion model, a new generative modeling paradigm, has achieved significant success in generating images, audio, video, and text. It has been adapted for sequence-to-sequence text generation (Seq2Seq) through DiffuSeq, termed S2S-Diffusion. Existing S2S-Diffusion models predominantly rely on fixed or hand-crafted rules to schedule noise during the diffusion and denoising processes. However, these models are limited by non-contextualized noise, which fails to fully consider the characteristics of Seq2Seq tasks. In this paper, we propose the Meta-DiffuB framework, a novel scheduler-exploiter S2S-Diffusion paradigm designed to overcome the limitations of existing S2S-Diffusion models. We employ Meta-Exploration to train an additional scheduler model dedicated to scheduling contextualized noise for each sentence. Our exploiter model, an S2S-Diffusion model, leverages the noise scheduled by our scheduler model for training updates and generation. Meta-DiffuB achieves state-of-the-art performance compared to previous S2S-Diffusion models and fine-tuned pre-trained language models (PLMs) across four Seq2Seq benchmark datasets. We further investigate and visualize the impact of Meta-DiffuB's noise scheduling on the generation of sentences with varying difficulties. Additionally, our scheduler model can function as a "plug-and-play" model to enhance DiffuSeq without the need for fine-tuning during the inference stage.
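The contrast between a fixed noise schedule and a contextualized one can be shown with a toy sketch. This is not the paper's method: the linear beta schedule, the per-sentence difficulty score, and the scaling rule below are all invented for illustration.

```python
def fixed_schedule(T=100, beta_min=1e-4, beta_max=0.02):
    """A hand-crafted linear beta schedule: the same noise for every sentence."""
    return [beta_min + (beta_max - beta_min) * t / (T - 1) for t in range(T)]

def contextual_schedule(difficulty, T=100, beta_min=1e-4, beta_max=0.02):
    """Toy stand-in for a learned scheduler model: scale the noise per sentence
    by a difficulty score in [0, 1], so harder sentences get gentler noise."""
    scale = 1.0 - 0.5 * difficulty
    return [beta * scale for beta in fixed_schedule(T, beta_min, beta_max)]

# An easy sentence keeps the full schedule; a hard one receives half the noise.
easy = contextual_schedule(0.0)
hard = contextual_schedule(1.0)
```

In Meta-DiffuB the scheduler is itself a trained model rather than a fixed scaling rule, but the interface is the same: it emits a per-sentence noise schedule that the exploiter consumes.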


How to install the macOS Sequoia public beta

Engadget

About a month after Apple announced it at WWDC 2024, macOS Sequoia is available to test-drive as a public beta. Although we don't recommend installing it on your primary Mac, here's how to get the 2024 version of macOS up and running ahead of its official rollout in the fall. First, you'll need a recent Mac to run the Sequoia public beta. Apple's software supports the following models: You'll notice that list still includes (up to) the last few generations of Intel Macs, so Apple may still be several years away from requiring Apple Silicon for its latest software. However, Apple Intelligence, which isn't yet included in the beta, will require a Mac with an M-series chip when it's available. Macs don't have automatic iCloud system backups like iOS devices, so you'll want to back up your Mac with Time Machine before installing.
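From the command line, a manual Time Machine backup can be kicked off with Apple's `tmutil` tool (macOS-only; run before installing the beta):

```shell
# Confirm a backup destination is configured
tmutil destinationinfo

# Start a backup now and wait until it completes before proceeding
tmutil startbackup --block
```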


Wood you believe it? Fully functional WOODEN car takes to the road - and it looks like something from a science fiction blockbuster

Daily Mail - Science & tech

Real time machines might only exist in science fiction, but this engineer's wooden vehicle certainly looks ready to head back to the future. A woodworker in Vietnam called Truong Van Dao has created a fully-functioning car made out of wood, complete with spinning cogs and pistons. In an incredible video, Mr Van Dao is shown hand-carving each wooden component of his vehicle, which is powered by electric batteries. And although the car doesn't move much faster than walking pace, commenters have been amazed by the intricate and time-consuming design. The charming contraption is reminiscent of Baldrick's working time machine from Blackadder – although this four-wheeled unit only travels through space, not time.


5 things to do first if you got a new Mac

FOX News

You know that feeling when you unbox a new Mac for the first time? You can't help but admire how sleek and smooth it looks and how bright and beautiful it glows when you turn it on. And don't get me started on those crisp and clean keys that make typing a breeze. Before we jump into all the cool stuff you can do with your Mac, there are some important things you need to set up first.


Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

Pantazopoulos, Georgios, Nikandrou, Malvina, Parekh, Amit, Hemanthage, Bhathiya, Eshghi, Arash, Konstas, Ioannis, Rieser, Verena, Lemon, Oliver, Suglia, Alessandro

arXiv.org Artificial Intelligence

Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models: 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation. To tackle these challenges, we propose an Embodied MultiModal Agent (EMMA): a unified encoder-decoder model that reasons over images and trajectories, and casts action prediction as multimodal text generation. By unifying all tasks as text generation, EMMA learns a language of actions which facilitates transfer across tasks. Unlike previous modular approaches with independently trained components, we use a single multitask model where each task contributes to goal completion. EMMA performs on par with similar models on several VL benchmarks and sets a new state-of-the-art performance (36.81% success rate) on Dialog-guided Task Completion (DTC), a benchmark for evaluating dialog-guided agents in the Alexa Arena.
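Casting action prediction as text generation amounts to serializing structured actions into a sequence the decoder can emit, then parsing them back for execution. A minimal sketch (the serialization format below is invented for illustration, not EMMA's actual action vocabulary):

```python
def action_to_text(action: str, target: str) -> str:
    """Serialize a structured action as a text sequence a seq2seq model can emit."""
    return f"act: {action} | object: {target}"

def text_to_action(text: str) -> tuple[str, str]:
    """Parse generated text back into an executable (action, object) pair."""
    act_part, obj_part = text.split(" | ")
    return act_part.removeprefix("act: "), obj_part.removeprefix("object: ")
```

Because every task shares this textual interface, the same decoder weights serve navigation, manipulation, and dialog turns alike, which is what enables transfer across tasks.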


LIMA: Less Is More for Alignment

Zhou, Chunting, Liu, Pengfei, Xu, Puxin, Iyer, Srini, Sun, Jiao, Mao, Yuning, Ma, Xuezhe, Efrat, Avia, Yu, Ping, Yu, Lili, Zhang, Susan, Ghosh, Gargi, Lewis, Mike, Zettlemoyer, Luke, Levy, Omer

arXiv.org Artificial Intelligence

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.
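The "standard supervised loss" here is ordinary next-token cross-entropy over the curated responses. A minimal sketch; the convention of masking out prompt tokens is a common one but an assumption on our part, and real training would operate on model logits rather than precomputed gold-token log-probabilities:

```python
import math

def sft_loss(gold_logprobs, response_mask):
    """Average negative log-likelihood of the gold next tokens, with prompt
    positions masked out so only response tokens contribute to the loss."""
    total = sum(-lp for lp, m in zip(gold_logprobs, response_mask) if m)
    return total / sum(response_mask)

# Two prompt tokens (masked) and two response tokens, each predicted with p=0.5.
loss = sft_loss([math.log(0.9), math.log(0.8), math.log(0.5), math.log(0.5)],
                [0, 0, 1, 1])
```

Nothing about this objective changes with dataset size; LIMA's claim is that with careful curation, roughly 1,000 such examples suffice for alignment.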